Streaming Min-max Hypergraph Partitioning

Authors

  • Dan Alistarh
  • Jennifer Iglesias
  • Milan Vojnovic

Abstract

In many applications, the data is of rich structure that can be represented by a hypergraph, where the data items are represented by vertices and the associations among items are represented by hyperedges. Equivalently, we are given an input bipartite graph with two types of vertices: items, and associations (which we refer to as topics). We consider the problem of partitioning the set of items into a given number of components such that the maximum number of topics covered by a component is minimized. This is a clustering problem with various applications, e.g. partitioning of a set of information objects such as documents, images, and videos, and load balancing in the context of modern computation platforms. In this paper, we focus on the streaming computation model for this problem, in which items arrive online one at a time and each item must be assigned irrevocably to a component at its arrival time. Motivated by scalability requirements, we focus on the class of streaming computation algorithms with memory limited to be at most linear in the number of components. We show that a greedy assignment strategy is able to recover a hidden co-clustering of items under a natural set of recovery conditions. We also report results of an extensive empirical evaluation, which demonstrate that this greedy strategy yields superior performance when compared with alternative approaches.
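
The abstract does not spell out the greedy rule, so the following is only a minimal sketch under stated assumptions: each arriving item goes to the component whose topic set grows the least (equivalently, the one sharing the most topics with the item), and ties go to the component that currently covers the fewest topics. The function name `greedy_stream_partition`, the tie-breaking rule, and the returned `assignment` dictionary are illustrative choices, not taken from the paper.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def greedy_stream_partition(
    stream: Iterable[Tuple[Hashable, Set[Hashable]]],
    k: int,
) -> Tuple[Dict[Hashable, int], List[Set[Hashable]]]:
    """Assign each arriving item, irrevocably, to one of k components.

    Assumed rule (not confirmed by the abstract): choose the component that
    gains the fewest new topics; break ties by the fewest topics covered so
    far, then by the lowest component index.
    """
    # One topic set per component: the only state the rule itself needs.
    topics: List[Set[Hashable]] = [set() for _ in range(k)]
    # Kept only so the caller can inspect the result; a real streaming
    # deployment would emit each decision instead of storing it.
    assignment: Dict[Hashable, int] = {}

    for item, item_topics in stream:
        best = min(
            range(k),
            key=lambda c: (len(item_topics - topics[c]), len(topics[c]), c),
        )
        topics[best] |= item_topics
        assignment[item] = best

    return assignment, topics


if __name__ == "__main__":
    # Toy stream with two hidden clusters: topics {1, 2} and topics {3, 4}.
    toy = [("a", {1, 2}), ("b", {3, 4}), ("c", {1, 2}), ("d", {3})]
    assignment, topics = greedy_stream_partition(toy, k=2)
    print(assignment)
    print(max(len(t) for t in topics))  # the min-max objective value
```

On the toy stream, items sharing topics end up in the same component, which is the kind of hidden co-clustering the abstract refers to. The objective being minimized is the size of the largest per-component topic set.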

Similar resources

Approximation Algorithms for Independent Set Problems on Hypergraphs

This thesis deals with approximation algorithms for the Maximum Independent Set and the Minimum Hitting Set problems on hypergraphs. As a hypergraph is a generalization of a graph, the question is whether the best known approximations on graphs can be extended to hypergraphs. We consider greedy, local search and partitioning algorithms. We introduce a general technique, called shrinkage reducti...

High Quality Hypergraph Partitioning via Max-Flow-Min-Cut Computations

In this thesis, we introduce a framework based on Max-Flow-Min-Cut computations for improving balanced k-way partitions of hypergraphs. Currently, variations of the FM heuristic [17] are used as local search algorithms in all state-of-the-art multilevel hypergraph partitioners. Such move-based heuristics have the disadvantage that they only incorporate local information about the problem struct...

Network Flow-Based Refinement for Multilevel Hypergraph Partitioning

We present a refinement framework for multilevel hypergraph partitioning that uses max-flow computations on pairs of blocks to improve the solution quality of a k-way partition. The framework generalizes the flow-based improvement algorithm of KaFFPa from graphs to hypergraphs and is integrated into the hypergraph partitioner KaHyPar. By reducing the size of hypergraph flow networks, improving ...

On solving Mincut Balanced Circuit Partitioning Problem for Digital Circuit Layout using Evolutionary Approach with Solution Archive

Finding an optimal partition has been an active topic in VLSI design in recent years. The Circuit Partitioning Problem is one of the most studied NP-complete problems, notable for its broad spectrum of applicability in digital circuit layout. The balance constraint is an important constraint that yields an area-balanced layout without compromising the mincut objective. This paper ...

Continuous bottleneck tree partitioning problems

We study continuous partitioning problems on tree network spaces whose edges and nodes are points in Euclidean spaces. A continuous partition of this space into p connected components is a collection of p subtrees, such that no pair of them intersect at more than one point, and their union is the tree space. An edge-partition is a continuous partition defined by selecting p − 1 cut points along ...

Publication date: 2015